[AutoParallel] Enhance processmesh #72052

Merged
merged 7 commits into from
Apr 28, 2025

xuxinyi389
Contributor

@xuxinyi389 xuxinyi389 commented Apr 3, 2025

PR Category

Auto Parallel

PR Types

Improvements

Description

card-73263

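  1. ProcessMesh supports a get_group method, which converts the mesh into the corresponding Group.
  2. ProcessMesh supports a get_submesh_with_dim method, which returns the communication SubMesh for the given "dim" dimension.
  3. ProcessMesh indexing now also accepts "str" keys; this simply calls get_submesh_with_dim.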

Example:

mesh_2d = dist.ProcessMesh([[0, 1, 2, 3], [4, 5, 6, 7]], dim_names=["dp", "tp"])
dp_mesh = mesh_2d["dp"]
tp_mesh = mesh_2d["tp"]

Calling mesh_2d["dp"] on ranks 0 and 4 returns a 1D submesh of ProcessMesh:([0, 4]).
Calling mesh_2d["dp"] on ranks 1 and 5 returns a 1D submesh of ProcessMesh:([1, 5]).
Calling mesh_2d["dp"] on ranks 2 and 6 returns a 1D submesh of ProcessMesh:([2, 6]).
Calling mesh_2d["dp"] on ranks 3 and 7 returns a 1D submesh of ProcessMesh:([3, 7]).
Calling mesh_2d["tp"] on ranks 0, 1, 2, and 3 returns a 1D submesh of ProcessMesh:([0, 1, 2, 3]).
Calling mesh_2d["tp"] on ranks 4, 5, 6, and 7 returns a 1D submesh of ProcessMesh:([4, 5, 6, 7]).
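
The per-rank results listed above can be checked without a distributed launch. The following is a minimal NumPy sketch of the slicing semantics only. `submesh_for_rank` is a hypothetical helper written for illustration, not part of Paddle, and it assumes every rank appears exactly once in the mesh.

```python
import numpy as np


def submesh_for_rank(mesh, dim_names, dim, rank):
    """Return the 1D list of process ids that share `rank`'s position
    on every mesh dimension except `dim` (hypothetical helper)."""
    arr = np.asarray(mesh)
    axis = dim_names.index(dim)

    # Locate the coordinates of `rank` inside the mesh.
    coords = np.argwhere(arr == rank)
    if len(coords) != 1:
        raise ValueError(f"rank {rank} must appear exactly once in the mesh")

    # Keep every coordinate fixed except the requested dimension,
    # which is taken in full.
    index = [int(c) for c in coords[0]]
    index[axis] = slice(None)
    return arr[tuple(index)].tolist()


mesh = [[0, 1, 2, 3], [4, 5, 6, 7]]
names = ["dp", "tp"]

for rank in range(8):
    print(rank, submesh_for_rank(mesh, names, "dp", rank),
          submesh_for_rank(mesh, names, "tp", rank))
```

Each rank sees only the group it belongs to, which is why the same expression `mesh_2d["dp"]` yields different submeshes on different ranks.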


paddle-bot bot commented Apr 3, 2025

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@xuxinyi389 xuxinyi389 force-pushed the enhance_processmesh branch from f6b4054 to b589534 Compare April 8, 2025 07:50

paddle-ci-bot bot commented Apr 16, 2025

Sorry to inform you that b589534's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@xuxinyi389 xuxinyi389 changed the title from "processmesh support convert group" to "[AutoParallel] Enhance processmesh" Apr 22, 2025
@xuxinyi389 xuxinyi389 force-pushed the enhance_processmesh branch from fd92afc to 38f9e95 Compare April 24, 2025 06:29
@xuxinyi389
Contributor Author

/re-run approval

return ProcessMesh(new_mesh, new_dim_names)

def get_submesh_with_dim(
Contributor

What is the difference between this method and get_mesh_with_dim?

Contributor Author

get_mesh_with_dim only rearranges the mesh. For example, mesh.get_mesh_with_dim("dp") just moves the dp dimension of the mesh to the outermost position; it does not reduce the process_ids in the mesh. mesh.get_submesh_with_dim("dp"), by contrast, returns the submesh for the dp communication group that contains the current rank. For example:

mesh_2d = dist.ProcessMesh([[0, 1, 2, 3], [4, 5, 6, 7]], dim_names=["dp", "tp"])
dp_mesh = mesh_2d.get_submesh_with_dim("dp")

On ranks 0 and 4, this returns a 1D submesh of ProcessMesh:([0, 4]).
On ranks 1 and 5, this returns a 1D submesh of ProcessMesh:([1, 5]).
On ranks 2 and 6, this returns a 1D submesh of ProcessMesh:([2, 6]).
On ranks 3 and 7, this returns a 1D submesh of ProcessMesh:([3, 7]).
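
To make the distinction concrete, here is a small NumPy model of the two behaviors as described in this reply. Both helper names (`reorder_with_dim`, `submesh_with_dim`) are hypothetical and written for illustration. This is a sketch of the described semantics, not Paddle's implementation.

```python
import numpy as np


def reorder_with_dim(mesh, dim_names, dim):
    """Model of the reordering behavior: move `dim` to the outermost
    axis. Every process id is kept (hypothetical helper)."""
    arr = np.asarray(mesh)
    axis = dim_names.index(dim)
    reordered = np.moveaxis(arr, axis, 0)
    new_names = [dim] + [n for n in dim_names if n != dim]
    return reordered.tolist(), new_names


def submesh_with_dim(mesh, dim_names, dim, rank):
    """Model of the submesh behavior: keep only the ids in the `dim`
    communication group that contains `rank` (hypothetical helper)."""
    reordered, _ = reorder_with_dim(mesh, dim_names, dim)
    arr = np.asarray(reordered)
    # Flatten all other dimensions; each column is one `dim` group.
    groups = arr.reshape(arr.shape[0], -1)
    column = int(np.argwhere(groups == rank)[0][1])
    return groups[:, column].tolist()


mesh = [[0, 1, 2, 3], [4, 5, 6, 7]]
names = ["dp", "tp"]

# Reordering along "tp" keeps all 8 ids, just transposed.
print(reorder_with_dim(mesh, names, "tp"))
# The submesh for rank 5 along "dp" keeps only 2 ids.
print(submesh_with_dim(mesh, names, "dp", 5))
```

In this model the reordered mesh is identical on every rank, while the submesh depends on which rank asks for it.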

@From00 From00 merged commit 6a0f5ce into PaddlePaddle:develop Apr 28, 2025
42 of 44 checks passed
YqGe585 pushed a commit to YqGe585/Paddle that referenced this pull request May 7, 2025
* processmesh support convert group

* add_test

* fix_test

* fix_en_docs

* move_test

* fix_bugs_of_get_mesh_with_dim

4 participants